Extended BIC Criterion for Model Selection
Abstract
Model selection is commonly based on some variation of the BIC or minimum message length criteria, such as MML and MDL. In either case, the criterion is split into two terms: one for the model (data code length/model complexity) and one for the data given the model (message length/data likelihood). For problems such as change detection, unsupervised segmentation or data clustering, it is common practice for the model term to comprise only a sum of sub-model terms. In this paper it is shown that the full model complexity must also take into account the number of sub-models and the labels which assign data to each sub-model. From this analysis we derive an extended BIC approach (EBIC) for this class of problems. Results with artificial data are given to illustrate the properties of this procedure.
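The distinction drawn in the abstract can be made concrete with a small sketch. The code below scores a one-dimensional segmentation in two ways: with a model term that is only a sum of per-segment BIC penalties, and with an extra charge for the number of segments and for the labels (here, the change-point positions). The extra charge is an assumption chosen for illustration (log N to state the number of segments, plus the log of the number of ways to place the change points); it is not the EBIC formula derived in the paper, which the abstract does not give. The function names and the Gaussian per-segment model are likewise hypothetical.

```python
import math


def gaussian_nll(segment):
    """Negative log-likelihood of a segment under a Gaussian fitted by maximum likelihood."""
    n = len(segment)
    mean = sum(segment) / n
    # Floor the variance so a constant segment does not produce log(0).
    var = max(sum((x - mean) ** 2 for x in segment) / n, 1e-12)
    return 0.5 * n * (math.log(2 * math.pi * var) + 1.0)


def split_segments(data, change_points):
    """Cut data at sorted change-point indices, each strictly between 0 and len(data)."""
    edges = [0] + list(change_points) + [len(data)]
    return [data[a:b] for a, b in zip(edges, edges[1:])]


def segmentation_score(data, change_points, extended=False, params_per_segment=2):
    """Return data term + model term for a segmentation (lower is better)."""
    n = len(data)
    segments = split_segments(data, change_points)
    k = len(segments)
    data_term = sum(gaussian_nll(s) for s in segments)
    # Common practice: the model term is only a sum of sub-model penalties.
    model_term = k * 0.5 * params_per_segment * math.log(n)
    if extended:
        # ASSUMPTION (illustrative, not the paper's formula):
        # log(n) to state the number of sub-models, plus the cost of the
        # labels, taken here as the log-count of possible change-point sets.
        model_term += math.log(n)
        model_term += math.log(math.comb(n - 1, k - 1))
    return data_term + model_term


data = [0.0, 0.1, -0.1, 0.05, 5.0, 5.1, 4.9, 5.05]
plain = segmentation_score(data, [4])
extended = segmentation_score(data, [4], extended=True)
print(plain, extended)
```

Under these assumptions the extended score always exceeds the plain one, and the gap grows with the number of segments, so heavily over-segmented solutions are penalized more strongly than by the sum of sub-model terms alone.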
Related resources
Approximating model probabilities in Bayesian information criterion and decision-theoretic approaches to model selection in phylogenetics.
A priori selection of models for use in phylogeny estimation from molecular sequence data is increasingly important as the number and complexity of available models increases. The Bayesian information criterion (BIC) and the derivative decision-theoretic (DT) approaches rely on a conservative approximation to estimate the posterior probability of a given model. Here, we extended the DT method b...
Estimating the Number of Components in a Mixture of Multilayer Perceptrons
The BIC criterion is widely used by the neural-network community for model selection tasks, although its convergence properties are not always theoretically established. In this paper we will focus on estimating the number of components in a mixture of multilayer perceptrons and proving the convergence of the BIC criterion in this frame. The penalized marginal-likelihood for mixture models and hidd...
Model Selection for Mixtures of Factor Analyzers via Hierarchical BIC
Bayesian information criterion (BIC) is a common model selection criterion for mixtures of factor analyzers (MFA). However, it is found that BIC penalizes each factor analyzer implausibly using the whole sample size. In this paper, we propose a new criterion for MFA called hierarchical BIC (H-BIC). Formally, the main difference from BIC is that H-BIC penalizes each factor analyzer using its own...
Speaker segmentation using the MAP-adapted Bayesian information criterion
The Bayesian information criterion (BIC) is a model selection criterion that has previously been applied to speaker segmentation of broadcast news by several researchers. The BIC approach treats speaker segmentation as a model selection problem. As the BIC requires the estimation of the sample covariance matrix, its performance tends to deteriorate as the speaker-turn duration decreases. It is ...